Abstract

Four diverse tools built on the Annotation Graph Toolkit are described. Each tool associates linguistic codes and structures with time- 
series data. All are based on the same software library and tool architecture. TableTrans is for observational coding, using a spreadsheet 
whose rows are aligned to a signal. MultiTrans is for transcribing multi-party communicative interactions recorded using multi-channel 
signals. InterTrans is for creating interlinear text aligned to audio. TreeTrans is for creating and manipulating syntactic trees. This 
work demonstrates that the development of diverse tools and re-use of software components is greatly facilitated by a common high- 
level application programming interface for representing the data and managing input/output, together with a common architecture for 
managing the interaction of multiple components. 



1. Introduction 

Annotation graphs provide an efficient and expressive 
data model for linguistic annotations of time-series data. 
We have developed a complete open-source software in- 
frastructure supporting the rapid development of tools for 
transcribing and annotating time-series data, the Annota- 
tion Graph Toolkit (AGTK). This general-purpose infras- 
tructure uses annotation graphs as the underlying model 
and allows developers to quickly create special-purpose an- 
notation tools using common components. An application 
programming interface, an input/output library supporting 
a variety of annotation file formats, and several graphical 
user interfaces have been developed. 

This paper describes four annotation tools based on 
AGTK which have been developed at the Linguistic Data 
Consortium. TableTrans is for observational coding, using 
a spreadsheet whose rows are aligned to a signal. Multi- 
Trans is for transcribing multi-party communicative inter- 
actions recorded using multi-channel signals. InterTrans is 
for creating interlinear text aligned to audio. TreeTrans is 
for creating and manipulating syntactic trees. The toolkit 
and tools are distributed under an open source software li- 
cense. 

2. The Annotation Graph Toolkit (AGTK) 

An annotation graph (Bird and Liberman, 2001) is a directed
acyclic graph where edges are labeled with fielded
records, and nodes are (optionally) labeled with time
offsets. The annotation graph model, a generalization of the
Tipster model used in text retrieval (Grishman, 1997), is
capable of representing virtually all types of linguistic
annotation (e.g. phonetic, orthographic, part-of-speech,
syntactic, discourse, intonational). This development has
opened up an interesting range of new possibilities for
creation, maintenance and search, and has led to new
annotation tools with applicability across the text, audio
and video modalities.
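
The definition above can be sketched as a small data structure. This is a minimal illustration in Python and not the AGTK API; the names `Node`, `Arc`, `AnnotationGraph`, `add_node`, `add_arc` and `arcs_of_type` are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A graph node; the time offset is optional, as in the model."""
    node_id: str
    offset: Optional[float] = None  # seconds into the signal, if anchored


@dataclass
class Arc:
    """A directed edge labeled with a fielded record (type plus features)."""
    start: str                      # id of the start node
    end: str                        # id of the end node
    arc_type: str                   # e.g. "word", "pos", "syntax"
    features: Dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotationGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    arcs: List[Arc] = field(default_factory=list)

    def add_node(self, node_id: str, offset: Optional[float] = None) -> Node:
        node = Node(node_id, offset)
        self.nodes[node_id] = node
        return node

    def add_arc(self, start: str, end: str, arc_type: str, **features: str) -> Arc:
        # Both endpoints must already exist in the graph.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must be added before the arc")
        arc = Arc(start, end, arc_type, dict(features))
        self.arcs.append(arc)
        return arc

    def arcs_of_type(self, arc_type: str) -> List[Arc]:
        return [a for a in self.arcs if a.arc_type == arc_type]


graph = AnnotationGraph()
graph.add_node("n0", 0.00)
graph.add_node("n1")                 # unanchored: no time offset
graph.add_node("n2", 0.85)
graph.add_arc("n0", "n1", "word", label="the")
graph.add_arc("n1", "n2", "word", label="dog")
graph.add_arc("n0", "n2", "syntax", label="NP")   # spans both words
```

Two word arcs and one syntactic arc share the same nodes here, which is how a single graph can hold several annotation types over one stretch of signal.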

The Annotation Graph Toolkit (AGTK) is a collection
of software for the development of annotation tools,
instantiating the annotation graph model. AGTK includes
application programming interfaces (APIs) for manipulating
annotation graph data and importing data from other
formats, wrappers for scripting languages, graphical user
interface (GUI) components specialized for annotation tasks,
and demonstration applications.

Maeda et al. (2002) describe the toolkit in detail. Ma et
al. (2002) describe collaborative annotation using a shared
database server.

3. Use of Third-party Software 

AGTK depends on a variety of third-party software, as 
detailed below. 

3.1. Scripting languages: Python and Tcl

AGTK provides interfaces to its libraries for both Tcl
and Python. Tcl is a scripting language with a native GUI
toolkit called Tk. Python is a newer object-oriented
scripting language. The early AGTK applications were
developed using Tcl, but more recent applications have been
developed in Python. Both sets of applications depend on the
Tk GUI library.

3.2. Audio waveform display: WaveSurfer 

WaveSurfer (Sjölander and Beskow, 2000) is open
source software for displaying and manipulating audio data,
developed by Kåre Sjölander and Jonas Beskow of KTH.
WaveSurfer requires Snack, also developed by Sjölander.
WaveSurfer is written in Tcl/Tk and its widget, called
wsurf, can be embedded in any Tcl/Tk application. There
is a Python/Tkinter interface for wsurf that permits it to be
embedded in a Python/Tkinter application.

3.3. Video display: QuickTime 

QuickTime Tcl [http://hem.fyristorg.com/matben/qt/]
can be embedded in a Tcl/Tk application. We have written
a simple wrapper in order to embed QuickTime Tcl in a
Python/Tkinter application. QuickTime Tcl requires the
QuickTime player distributed by Apple Corporation,
[http://www.apple.com/quicktime/download/].

Figure 1: TableTrans With WaveSurfer



4. TableTrans 

TableTrans is a spreadsheet-based annotation tool for
audio signals. It was primarily intended for linguistic
annotations, but it can be used for many kinds of
observational coding, such as that widely practiced in
ethology. The tool allows users to annotate regions of
signal with arbitrary features, such as speaker identifiers,
utterance identifiers and transcriptions. Each row of the
spreadsheet corresponds to a region of the signal. Each
column corresponds to a user-defined feature, and has a
user-defined width. For example, one could define a width
of 5 characters for the speaker identifier, and a width of 40
for the transcription. One could also choose to define ten
or more columns as shown below, and use the tool for more
structured coding.
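
The row and column model described above can be sketched as follows. This is an illustrative sketch only, not TableTrans code; the `Row` class, the `COLUMNS` layout and the `render` helper are assumptions made for the example, with the widths 5 and 40 taken from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Row:
    """One spreadsheet row: a signal region plus user-defined features."""
    start: float                 # region start, in seconds
    end: float                   # region end, in seconds
    features: Dict[str, str] = field(default_factory=dict)


# Hypothetical column layout: (feature name, display width in characters).
COLUMNS: List[Tuple[str, int]] = [("speaker", 5), ("transcript", 40)]


def render(row: Row, columns: List[Tuple[str, int]] = COLUMNS) -> str:
    """Format a row as fixed-width cells, truncating values that overflow."""
    cells = [f"{row.start:8.2f}", f"{row.end:8.2f}"]
    for name, width in columns:
        value = row.features.get(name, "")
        cells.append(value[:width].ljust(width))
    return " | ".join(cells)


row = Row(1.25, 3.5, {"speaker": "A", "transcript": "hello there"})
line = render(row)
```

Adding a third entry to `COLUMNS` is all that is needed to code an extra feature, which mirrors the way the tool lets users define ten or more columns.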

Figure 1 shows TableTrans with a waveform display
based on WaveSurfer. Figure 2 shows TableTrans with a
video module based on QuickTime and QuickTime Tcl.

Setting Current Region Swiping a region in the waveform
display while holding down the left mouse button
highlights the region. This region is called the current
region. In the video version of TableTrans, the current region
can be set with two operations: pressing Control-s at the
beginning of the region, and pressing Control-e at the end of
the region.



Create Annotation Pressing the Return key inserts a 
blank annotation row in the spreadsheet. If the current re- 
gion of the waveform is chosen, the start and end times of 
the current region are inserted. 

Delete Current Annotation Pressing Control-d deletes 
the currently highlighted annotation (the current annota- 
tion). 

Update Current Region Pressing Control-g updates the 
start and end times of the current annotation using the cur- 
rent region in the waveform. 

Navigate in Spreadsheet The Tab key moves the cursor 
one cell to the right, if possible. The right, left, up, and 
down arrows move to the neighboring cells in the corre- 
sponding directions. 

Navigate Within Current Cell The Shift-Right and 
Shift-Left keys move the insertion point within the current 
cell. 

Toggle Play and Stop The F1 key toggles playback and
pausing of the recording. If the current region is chosen,
it plays the current region. If a single point is selected in
the waveform, F1 starts playback from the point, and
annotations corresponding to the audio cursor are highlighted
(aligned playback).

Figure 2: TableTrans With the QuickTime Player

Sort Annotations Double-clicking the left mouse button
on a feature name (a column heading) of the
spreadsheet sorts the annotations according to the values
in the column. For example, double-clicking on
the cell titled 'f1' will sort all the annotations according
to the values in that column. Sorting according to
start and end times can also be done from the menu
items Trans->Sort->Sort Annotations by Start
Time and Trans->Sort->Sort Annotations by End
Time.

Find The menu item Trans->Find or the Control-f key
combination brings up a dialog window to search for a string
in the spreadsheet. If a matching string is found in a cell,
the cell is highlighted and its annotation row becomes the
current annotation.

Control View of Annotations The menu item
Trans->View->Show Select Rows lets the user
specify a feature and a value so that only the annotations
having this value for the feature will be displayed. The
other menu items in Trans->View are for displaying all
annotations and hiding all annotations in the spreadsheet.
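
The sorting and view-filtering behaviour described above amounts to two simple operations on the list of rows. The following sketch assumes each annotation is a plain dictionary; the function names `sort_by` and `select_rows` and the sample data are hypothetical and are not part of TableTrans.

```python
from typing import Any, Dict, List

# Each annotation is a plain dict; "start" and "end" are times in seconds.
Annotation = Dict[str, Any]


def sort_by(annotations: List[Annotation], column: str) -> List[Annotation]:
    """Return the annotations ordered by the values in one column."""
    return sorted(annotations, key=lambda a: a[column])


def select_rows(annotations: List[Annotation], feature: str, value: Any) -> List[Annotation]:
    """Keep only the annotations whose feature has the given value."""
    return [a for a in annotations if a.get(feature) == value]


annotations = [
    {"start": 4.0, "end": 5.5, "speaker": "B", "transcript": "really?"},
    {"start": 0.0, "end": 1.5, "speaker": "A", "transcript": "hello"},
    {"start": 2.0, "end": 3.0, "speaker": "A", "transcript": "it rained"},
]
by_start = sort_by(annotations, "start")              # like Sort by Start Time
only_a = select_rows(annotations, "speaker", "A")     # like Show Select Rows
```

Both helpers return new lists and leave the underlying annotations untouched, so a filtered view can be dropped again without losing rows.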

Open Sound/Movie File The menu item File->Open
Sound File loads a sound file into the waveform panel.
It automatically adjusts the number of waveforms according
to the number of channels in the sound file. All sound
file formats supported by WaveSurfer are supported. In
the video version of TableTrans, the menu item File->Open
Movie File opens a movie file.

Open Annotation File The menu item File->Open
Annotation File provides the users with options to load
an annotation file in the following formats: XML (AIF),
Table Format (csv, etc.) and LCF (the LDC Callhome
Format).

The application opens appropriate dialog windows for 
each format. First, the user is prompted for the file. Then, 
for a Table format, a window for specifying names of fea- 
tures and a delimiter is opened. The names specified here 
are used as feature names and column headers. 
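
Loading a table-format file with user-supplied feature names and a delimiter can be sketched with Python's standard `csv` module. This is not the AGTK input/output library; the `load_table` function, the sample data and the chosen feature names are assumptions for illustration.

```python
import csv
import io
from typing import Dict, List, TextIO


def load_table(stream: TextIO, feature_names: List[str],
               delimiter: str = ",") -> List[Dict[str, str]]:
    """Read delimited rows and label each field with a feature name."""
    reader = csv.reader(stream, delimiter=delimiter)
    rows = []
    for record in reader:
        if not record:            # skip blank lines
            continue
        rows.append(dict(zip(feature_names, record)))
    return rows


# A tab-delimited sample standing in for a file chosen in the dialog.
sample = io.StringIO("0.00\t1.20\tA\thello\n1.20\t2.75\tB\thi there\n")
rows = load_table(sample, ["start", "end", "speaker", "transcript"],
                  delimiter="\t")
```

The names passed in play the same role as the names entered in the dialog: they become the keys of each row and would serve as the column headers.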

Save Annotations in File If a file name and a format type
have already been chosen, the menu item File->Save
will save all the annotations in the file. If they have not
been chosen, the user needs to use one of the File->Save
Annotations As menu items.

Save Current Sound Region in File The menu item
Sound->Save Current Region in File lets the user
save the current region of the sound data into a file.

Column configurations Column headers and widths may
be changed interactively, or may be specified in a
configuration file.

Database support TableTrans supports access to the
database component of AGTK. Figure 3 shows the dialog
window for entering an ODBC connect string. Table 1 shows
some of the parameters used in a connect string. For a
complete list, please see
[http://www.mysql.com/doc/M/y/MyODBC_connect_parameters.html].

Figure 3: ODBC connect string dialog



Parameter   Description
DSN         Registered ODBC Data Source Name.
SERVER      The hostname of the database server.
UID         User name as established on the server. For SQL
            Server this is the logon name.
PWD         Password that corresponds with the logon name.
DATABASE    Database to connect to. If not given, DSN is used.

Table 1: ODBC Connect String Parameters
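
A connect string built from the parameters in Table 1 might look like the output of the sketch below. It assumes the conventional ODBC form of semicolon-separated KEYWORD=value pairs; the function name `build_connect_string` and all the values are placeholders invented for this example.

```python
from typing import Dict

# Parameter names taken from Table 1, in the order they are listed there.
PARAMETERS = ("DSN", "SERVER", "UID", "PWD", "DATABASE")


def build_connect_string(values: Dict[str, str]) -> str:
    """Join the supplied parameters into KEYWORD=value pairs."""
    unknown = sorted(set(values) - set(PARAMETERS))
    if unknown:
        raise ValueError(f"unsupported parameter(s): {', '.join(unknown)}")
    pairs = [f"{name}={values[name]}" for name in PARAMETERS if name in values]
    return ";".join(pairs) + ";"


# Placeholder values; DATABASE is omitted, so the DSN setting would apply.
connect_string = build_connect_string(
    {"DSN": "agtk", "SERVER": "localhost", "UID": "annotator", "PWD": "secret"}
)
```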

5. MultiTrans 

MultiTrans is a transcription tool for transcribing
multi-party conversations in multi-channel audio signals. The
user interface is similar to Transcriber (Barras et al., 2001),
but MultiTrans has one transcription panel corresponding
to each channel in the signal.

Figure 4 contains a screenshot of MultiTrans with a
two-channel audio signal. The left transcription panel
corresponds to the first channel in the audio signal, and the
right transcription panel corresponds to the second
channel. The boxes labeled from A to E in the figure illustrate
the following points.

A: The text panel for speaker 2. The channel associated
with speaker 2 is the current channel.

B: This is the second annotation for speaker 2. It is also
the current annotation, and is highlighted.

C: A segment for speaker 1. This shows some of the
transcript for this annotation and the associated region of
the waveform.

D: The highlighted region of the waveform for the second
annotation. This is the current annotation and its entire
waveform region is highlighted.

E: The hollow play button. This button will play the
current channel only, muting other speech channels.

Figure 4: MultiTrans, for Transcribing Multi-Party Conversation

There are two ways to create an annotation; both are
explained below. All annotations are inserted into the text
panel, sorted by their starting position.

Create Annotation (Specific) The first way to create an 
annotation is to explicitly highlight a region in the wave- 
form and press the Return key. This will create a bullet in 
the appropriate text panel that designates the created anno- 
tation. Also, a region is created below the waveform itself 
to designate the created annotation. Only once an annota- 
tion is created can the transcription of that region begin. 



Create Annotation (Non-Specific) Another way to cre- 
ate an annotation is during speech playback. When the 
playback of speech is begun the user may press the Return 
key to insert an anchor in the current channel. When Return 
is pressed a small black bar will appear below the wave- 
form. This designates the starting position of the current 
annotation. When Return is pressed a second time the end 
anchor for the annotation is inserted and the annotation is 
created. Ending speech playback destroys any start anchors 
that do not yet have an associated end anchor. 
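
The anchor behaviour during playback can be read as a small state machine per channel. The sketch below is one possible reading of the description and not MultiTrans code; the class `AnchorRecorder` and its method names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Region = Tuple[float, float]


class AnchorRecorder:
    """Tracks Return presses per channel while playback is running."""

    def __init__(self) -> None:
        self.pending: Dict[int, float] = {}            # channel -> start anchor
        self.annotations: Dict[int, List[Region]] = {}

    def press_return(self, channel: int, time: float) -> Optional[Region]:
        """First press sets a start anchor; second press closes the region."""
        if channel not in self.pending:
            self.pending[channel] = time
            return None
        region = (self.pending.pop(channel), time)
        regions = self.annotations.setdefault(channel, [])
        regions.append(region)
        regions.sort()            # keep annotations ordered by start position
        return region

    def stop_playback(self) -> None:
        """Discard start anchors that never received an end anchor."""
        self.pending.clear()


recorder = AnchorRecorder()
recorder.press_return(channel=0, time=1.0)           # start anchor
first = recorder.press_return(channel=0, time=2.5)   # end anchor: region created
recorder.press_return(channel=1, time=3.0)           # start anchor, never closed
recorder.stop_playback()                             # the open anchor is dropped
```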

Delete Current Annotation An annotation can be 
deleted using Control-d. To delete an annotation one must 
first select the annotation to be deleted. The current anno- 
tation can easily be distinguished from others because its 
transcription and waveform regions are highlighted. When 
an annotation is deleted its transcription and its association 
with the waveform region are also deleted. 

Change Current Annotation Once an annotation is
created its region in the waveform can be changed without
deleting the entire annotation. This is done by selecting
the desired annotation (with a click in the text panel or
segment below the waveform), and moving either the end point
or start point of the annotation and pressing the Return key
to register the change.

Split Current Annotation Large annotations can be split
into smaller annotations using the split current annotation
command. First, the area in the text transcription where the
split is to occur is selected. Next, the area in the waveform
where the split is to occur is selected. Finally, the Return
key is pressed and the old annotation is split into two new
annotations, each associated with a different waveform
region.

Join Current Annotation Joining an annotation is the 
opposite of splitting an annotation. This is done by se- 
lecting an annotation and pressing the Shift-BackSpace key 
combination. This will merge the currently selected anno- 
tation region and transcription with the one that occurs im- 
mediately before it. 

Squeeze Current Annotation When an annotation is 
squeezed its starting boundary is pushed to the ending 
boundary of the previous annotation. This is used when 
annotations are desired to be separate, but one is to begin 
as soon as the other ends. This is done by selecting an anno- 
tation and pressing the Control-Shift-BackSpace key com- 
bination. 
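
The split, join and squeeze operations described above can be sketched on a list of time-ordered segments. This is an illustration under simplifying assumptions (one channel, text split at a character offset) and not the MultiTrans implementation; `Segment`, `split`, `join` and `squeeze` are names chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float   # seconds
    end: float     # seconds
    text: str


def split(segments: List[Segment], index: int, time: float,
          text_offset: int) -> List[Segment]:
    """Split one segment at a waveform time and a transcript position."""
    seg = segments[index]
    if not seg.start < time < seg.end:
        raise ValueError("split time must fall inside the segment")
    left = Segment(seg.start, time, seg.text[:text_offset].rstrip())
    right = Segment(time, seg.end, seg.text[text_offset:].lstrip())
    return segments[:index] + [left, right] + segments[index + 1:]


def join(segments: List[Segment], index: int) -> List[Segment]:
    """Merge a segment with the one immediately before it."""
    if index == 0:
        raise ValueError("the first segment has no predecessor")
    prev, cur = segments[index - 1], segments[index]
    merged = Segment(prev.start, cur.end, f"{prev.text} {cur.text}".strip())
    return segments[:index - 1] + [merged] + segments[index + 1:]


def squeeze(segments: List[Segment], index: int) -> List[Segment]:
    """Move a segment's start to the end of the previous segment."""
    if index == 0:
        raise ValueError("the first segment has no predecessor")
    prev, cur = segments[index - 1], segments[index]
    moved = Segment(prev.end, cur.end, cur.text)
    return segments[:index] + [moved] + segments[index + 1:]


segments = [Segment(0.0, 2.0, "hello world"), Segment(2.5, 4.0, "again")]
after_split = split(segments, 0, 1.0, 5)
after_join = join(after_split, 1)
after_squeeze = squeeze(segments, 1)
```

Joining the two halves produced by a split restores the original segment, which reflects the statement that join is the opposite of split.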

Toggle Speech Playback There are several ways to begin 
speech playback. Pressing the Tab key will toggle speech 
playback of the current annotation, or the entire speech file 
if there is not a current annotation. Playback can also be 
initiated by pressing the solid play button in the waveform 
panel. Either of these commands will play all channels in 
the speech file. Pressing the hollow play button located in 
the waveform panel will play the current channel only, mut- 
ing all other channels. This is useful when there are several 
channels that make speaker distinction difficult. 

6. InterTrans

InterTrans is an interlinear text editor. Interlinear text is
a kind of text in which each word is annotated with some
combination of phonological, morphological and syntactic
information (displayed under the word) and each sentence
is annotated with a free translation. For an extended
discussion of interlinear text, and how to model it using
annotation graphs, see (Maeda and Bird, 2000).

InterTrans permits interlinear transcription aligned to a
primary audio signal, for greater convenience, accuracy and
accountability. Whole words and sub-parts of words can be
easily aligned with the audio. Clicking on any part of the
annotation causes the corresponding extent of audio signal
to be highlighted. As an extended recording is played back,
annotated sections are highlighted (both waveform and
interlinear text displays). Figure 5 contains a screenshot of
InterTrans.

Figure 5: InterTrans, Interlinear Text Transcription

The linguistic levels of information such as Free
Translation (FT), Word (WD) and Morpheme (MP) are called
types. The relationships among types are defined in a
configuration file. For example, if the type WD is defined to
dominate the type MP, operations described in this section,
such as split and join, on any cell of the type WD will cause
the corresponding cell of the type MP to split or join as
well. It is also possible to define two or more types as
belonging to an equivalence class. In the implementation,
this is done by using multi-attribute labels of arcs. For
example, if the types Morpheme (MP) and Morphemic Gloss
(MP-GLOSS) are to be used, these can be defined as types of
an equivalence class. Then, operations such as split on any
of the cells of the type MP will cause the corresponding cell
of the type MP-GLOSS to split as well.
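
The way an operation propagates through dominance and equivalence relations can be sketched as a reachability computation. The configuration format of InterTrans is not given in this paper, so the `DOMINATES` and `EQUIVALENCE_CLASSES` structures and the `affected_types` function below are assumptions; only the WD, MP and MP-GLOSS relations come from the text.

```python
from typing import Dict, List, Set

# Hypothetical in-memory form of the type configuration.
DOMINATES: Dict[str, List[str]] = {"WD": ["MP"]}
EQUIVALENCE_CLASSES: List[Set[str]] = [{"MP", "MP-GLOSS"}]


def affected_types(cell_type: str,
                   dominates: Dict[str, List[str]] = DOMINATES,
                   classes: List[Set[str]] = EQUIVALENCE_CLASSES) -> Set[str]:
    """Return every type whose cells follow a split or join on cell_type."""
    affected: Set[str] = set()
    pending = [cell_type]
    while pending:
        current = pending.pop()
        if current in affected:
            continue
        affected.add(current)
        # Dominated types follow the operation ...
        pending.extend(dominates.get(current, []))
        # ... and so do all members of the same equivalence class.
        for members in classes:
            if current in members:
                pending.extend(members)
    return affected


word_split = affected_types("WD")
morpheme_split = affected_types("MP")
translation_split = affected_types("FT")
```

In this reading a split on a WD cell reaches MP through dominance and MP-GLOSS through equivalence, while a split on an MP cell does not travel upward to WD.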

Insert New Cell A new cell can be inserted after the cur- 
rent cell by clicking on the Insert New Cell button. A nor- 
mal procedure for creating interlinear text is to use this 
function first, and then split the sub cells as necessary. 

Delete Cell Deleting a cell can be done in one of the
following ways after selecting the current cell: use the
Control-d key combination, or click on the Delete Cell
button.

Split Cell Splitting a cell in InterTrans can be done in a
similar manner to MultiTrans. First, the current cell should
be selected. Optionally, the splitting point in the waveform
can be chosen if the current cell is associated with a region
in the waveform. Pressing the Return key, or clicking on
the Split Cell button will split the current cell.¹

Join Cell Joining two adjoining cells is simple. First, the 
current cell should be chosen. Then clicking on the Join 
Sub Cell button or pressing the Control-j key combination 
will join the current cell and its preceding cell. 

Alignment with Audio Waveform A partial or complete 
set of cells can be aligned with the audio signal displayed in 
the waveform panel. This can be done by selecting the cur- 
rent cell in the interlinear transcription panel, and selecting 
the corresponding region in the waveform panel. The align- 
ment can be changed later. 



¹ We plan to modify the program so that key bindings can be
assigned by the user.

Figure 6: TreeTrans: Syntactic Transcription (Horizontal and Vertical Views)



7. TreeTrans 

TreeTrans is a tool for displaying and annotating tree
structures. Basic annotation functions are provided, and
functions can be easily added or changed for specific
annotation projects. Trees can be displayed in three forms:
bottom-up, top-down, and vertical. Two views of trees in
TreeTrans are shown in Figure 6. Every node of the
displayed tree has one of three types: syn, a syntactic or
internal node; pos, a part-of-speech node; or wrd, a node
containing original text.

To change a displayed tree, users highlight the appropri- 
ate node(s) then choose a function from the list of buttons. 
All the tree manipulation functions preserve the surface or- 
der of the word string. 

Input/Output TreeTrans can input either Penn Treebank- 
style bracketed trees or annotation graphs. Likewise, it can 
store annotated trees in either of these formats. 
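
Reading and writing Penn Treebank-style bracketed trees can be sketched as follows. This is a minimal reader and writer written for illustration, not the TreeTrans input/output code; it assumes every bracket carries a label and represents a tree as a `(label, children)` tuple, with words as plain strings.

```python
import re
from typing import List, Tuple, Union

Tree = Tuple[str, List[Union["Tree", str]]]


def parse_bracketed(text: str) -> Tree:
    """Parse one labeled bracketing such as "(S (NP (NN dog)))"."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse() -> Tree:
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]
        pos += 1
        children: List[Union[Tree, str]] = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])   # a word at the leaf
                pos += 1
        pos += 1                               # consume ')'
        return (label, children)

    tree = parse()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def to_bracketed(tree: Union[Tree, str]) -> str:
    """Write a tree back out in bracketed form."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return "(" + label + " " + " ".join(to_bracketed(c) for c in children) + ")"


source = "(S (NP (DT the) (NN dog)) (VP (VBZ barks)))"
tree = parse_bracketed(source)
round_trip = to_bracketed(tree)
```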

InsertInternalNode If one node is highlighted, this
function adds an internal node as the parent of the highlighted
node. If two nodes are highlighted, this function adds an
internal node as the parent of the two highlighted nodes
and any intervening material. If the branches of the new
internal node would cross any existing branches of the tree,
the insertion is not performed.
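
The insertion described above can be sketched for the simplest case, where the highlighted nodes are siblings. This is a simplified illustration and not the TreeTrans algorithm: the sketch refuses any request where the two nodes have different parents (or the node is the root) rather than testing for crossing branches in general. `TreeNode`, `words` and `insert_internal_node` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by content
class TreeNode:
    kind: str                     # "syn", "pos" or "wrd"
    label: str
    children: List["TreeNode"] = field(default_factory=list)
    parent: Optional["TreeNode"] = field(default=None, repr=False)

    def add(self, child: "TreeNode") -> "TreeNode":
        child.parent = self
        self.children.append(child)
        return child


def words(node: TreeNode) -> List[str]:
    """Surface word string: labels of wrd nodes, left to right."""
    if node.kind == "wrd":
        return [node.label]
    return [w for child in node.children for w in words(child)]


def insert_internal_node(first: TreeNode, last: TreeNode,
                         label: str) -> Optional[TreeNode]:
    """Wrap first..last and everything between them in a new syn node."""
    parent = first.parent
    if parent is None or last.parent is not parent:
        return None               # not handled by this sketch
    i = parent.children.index(first)
    j = parent.children.index(last)
    if i > j:
        i, j = j, i
    new_node = TreeNode("syn", label, parent=parent)
    wrapped = parent.children[i:j + 1]
    for child in wrapped:
        child.parent = new_node
    new_node.children = wrapped
    parent.children[i:j + 1] = [new_node]
    return new_node


root = TreeNode("syn", "S")
pos_nodes = []
for tag, word in [("DT", "the"), ("NN", "dog"), ("VBZ", "barks")]:
    pos = root.add(TreeNode("pos", tag))
    pos.add(TreeNode("wrd", word))
    pos_nodes.append(pos)

noun_phrase = insert_internal_node(pos_nodes[0], pos_nodes[1], "NP")
```

Passing the same node as `first` and `last` covers the single-node case, and the word string returned by `words(root)` is unchanged by the insertion.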

Delete This deletes the highlighted node. If the high- 
lighted node has type pos, its corresponding wrd node is 
also deleted. If the deletion would cause an internal node 
to have no leaves, the deletion is not performed. 



MoveNode When two nodes are highlighted, this moves 
the first highlighted node and its subtree to become a child 
of the second highlighted node. When three nodes are high- 
lighted, the first two nodes and any intervening material are 
moved to become children of the third highlighted node. 
The move is not performed if it would cause word order to 
be changed, or if it would result in an internal node with no 
leaves. 

Adjoin This is a syntactic adjoin, which creates a clone 
of the second highlighted node, then moves the first high- 
lighted node to become a child of the clone. 

SynWrdBefore/After This function adds a pair of nodes, 
one of type syn, and one of type wrd. It is used for adding 
traces to the text. The new syn node is a previous or fol- 
lowing sibling of the highlighted node. 

ChangeLabel This changes the label of the highlighted 
node. 

CoRef This gives two highlighted nodes the same trace 
number. 

BuildDefaultTree This builds a basic tree structure using 
the input sentence as the terminals of the tree. 
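
The shape of the default tree is not specified here, so the sketch below makes an assumption: one root syn node with a pos node above each wrd node. The function `build_default_tree` and the placeholder labels `"S"` and `"X"` are hypothetical, while the node types syn, pos and wrd are the ones named in this section.

```python
from typing import Any, Dict


def build_default_tree(sentence: str, root_label: str = "S",
                       default_pos: str = "X") -> Dict[str, Any]:
    """Build a flat tree whose terminals are the words of the sentence."""
    root: Dict[str, Any] = {"type": "syn", "label": root_label, "children": []}
    for word in sentence.split():
        wrd = {"type": "wrd", "label": word, "children": []}
        # Each word gets a placeholder part-of-speech node as its parent.
        pos = {"type": "pos", "label": default_pos, "children": [wrd]}
        root["children"].append(pos)
    return root


tree = build_default_tree("the dog barks")
terminals = [pos["children"][0]["label"] for pos in tree["children"]]
```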

8. Obtaining the tools

All of the annotation tools described in this paper are
distributed under an open source license, and available from
[http://www.ldc.upenn.edu/AG/].